Covering a Circular String with Substrings of Fixed Length
Authors
Abstract
A nonempty circular string $C(x)$ of length $n$ is said to be covered by a set $U_k$ of strings, each of fixed length $k \le n$, iff every position in $C(x)$ lies within an occurrence of some string $u \in U_k$. In this paper we consider the problem of determining the minimum cardinality of a set $U_k$ which guarantees that every circular string $C(x)$ of length $n \ge k$ can be covered. In particular, we show how, for any positive integer $m$, to choose the elements of $U_k$ so that, for sufficiently large $k$, $u_k \approx \sigma^{k-m}$, where $u_k = |U_k|$ and $\sigma$ is the size of the alphabet on which the strings are defined. The problem has application to DNA sequencing by hybridization using oligonucleotide probes.
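The covering condition in the abstract can be illustrated with a short check. This is only a sketch of the definition, not the paper's construction of $U_k$; the function name `is_covered` and its arguments are hypothetical and chosen here for illustration.

```python
def is_covered(x: str, U: set, k: int) -> bool:
    """Return True if every position of the circular string C(x) lies
    within an occurrence of some length-k string in U (hypothetical helper)."""
    n = len(x)
    if not (0 < k <= n):
        raise ValueError("need 0 < k <= len(x)")
    # Append the first k-1 letters so that windows wrapping around the
    # end of x become ordinary contiguous slices.
    unrolled = x + x[:k - 1]
    covered = [False] * n
    for start in range(n):
        if unrolled[start:start + k] in U:
            # Mark all k positions of this occurrence, modulo n.
            for offset in range(k):
                covered[(start + offset) % n] = True
    return all(covered)
```

For example, with x = "aab" and k = 2, the set {"aa"} leaves the position of "b" uncovered, while {"aa", "ba"} covers it through the occurrence of "ba" that wraps around the end of the string.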
Similar resources
De Bruijn Sequences for Fixed-Weight Binary Strings
De Bruijn sequences are circular strings of length $2^n$ whose length-$n$ substrings are the binary strings of length $n$. Our focus is on creating circular strings of length (
Fixed-density De Bruijn Sequences
De Bruijn sequences are circular strings of length $2^n$ whose length-$n$ substrings are the binary strings of length $n$. Our focus is on de Bruijn sequences for binary strings that have the same density (number of 1s). We construct circular strings of length $\binom{n-1}{d}$
An Efficient Algorithm for Finding Similar Short Substrings from Large Scale String Data
Finding similar substrings/substructures is a central task in analyzing huge amounts of string data such as genome sequences, web documents, log data, etc. In the sense of complexity theory, the existence of polynomial time algorithms for such problems is usually trivial since the number of substrings is bounded by the square of their lengths. However, straightforward algorithms do not work for...
Generating Necklaces and Strings with Forbidden Substrings
Given a length $m$ string $f$ over a $k$-ary alphabet and a positive integer $n$, we develop efficient algorithms to generate (a) all $k$-ary strings of length $n$ that have no substring equal to $f$, (b) all $k$-ary circular strings of length $n$ that have no substring equal to $f$, and (c) all $k$-ary necklaces of length $n$ that have no substring equal to $f$, where $f$ is an aperiodic necklace. Each of the algorithms ru...
A Parallel Algorithm for the Fixed-length Approximate String Matching Problem for High Throughput Sequencing Technologies
The approximate string matching problem is to find all locations at which a query of length m matches a substring of a text of length n with k-or-fewer differences. Nowadays, with the advent of novel high throughput sequencing technologies, the approximate string matching algorithms are used to identify similarities, molecular functions and abnormalities in DNA sequences. We consider a generali...
Journal:
- Int. J. Found. Comput. Sci.
Volume 7, Issue
Pages -
Publication date: 1996